Modal Clustering in a Univariate Class of Product Partition Models

Author

  • David B. Dahl
Abstract

This paper presents an algorithm for finding the maximum a posteriori (MAP) clustering in a class of univariate product partition models. While the number of possible clusterings of n observations grows according to the Bell exponential number, the dynamic programming algorithm presented here exploits properties of the model to provide an O(n^2) search. Hence, the algorithm can be used to find the MAP clustering for tens of thousands of univariate data points, whereas previously it could only be approximated through a stochastic search. Integrating over the latent location variables in a Dirichlet Process mixture (DPM) model leads to a product partition model. The paper shows that several univariate, conjugate DPM models satisfy the conditions for the mode-finding algorithm. The clustering algorithm is demonstrated using data from a microarray experiment to detect differential gene expression involving 4,608 genes.
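The O(n^2) search described in the abstract can be illustrated with a minimal sketch. This is not the paper's own code, and it rests on several assumptions the abstract does not spell out: that the MAP clustering consists of clusters that are contiguous once the data are sorted, that the sampling model is normal with known variance sigma2 and a conjugate normal prior N(mu0, tau2) on each cluster mean, and that the cohesion of a cluster S is alpha * Gamma(|S|), as arises from a DPM with mass parameter alpha. The function name map_clustering and all parameter names are hypothetical.

```python
import math
from itertools import accumulate


def map_clustering(y, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=1.0):
    """Sketch of an O(n^2) dynamic program for a MAP partition.

    Assumptions (not taken from the abstract): clusters are contiguous in
    the sorted data; y_i ~ N(mu_S, sigma2) with mu_S ~ N(mu0, tau2);
    cohesion c(S) = alpha * Gamma(|S|).
    """
    ys = sorted(y)
    n = len(ys)
    # Prefix sums give each block's sum and sum of squares in O(1).
    s1 = [0.0] + list(accumulate(ys))
    s2 = [0.0] + list(accumulate(v * v for v in ys))

    def log_score(j, i):
        """Log cohesion plus log marginal likelihood of the block ys[j:i]."""
        k = i - j
        s = s1[i] - s1[j]
        q = s2[i] - s2[j]
        dev1 = s - k * mu0                      # sum of (y - mu0)
        dev2 = q - 2.0 * mu0 * s + k * mu0 * mu0  # sum of (y - mu0)^2
        quad = (dev2 - tau2 / (sigma2 + k * tau2) * dev1 * dev1) / sigma2
        log_marginal = -0.5 * (
            k * math.log(2.0 * math.pi)
            + (k - 1) * math.log(sigma2)
            + math.log(sigma2 + k * tau2)
            + quad
        )
        log_cohesion = math.log(alpha) + math.lgamma(k)
        return log_cohesion + log_marginal

    # best[i]: best log score over partitions of the first i sorted points.
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(i):
            cand = best[j] + log_score(j, i)
            if cand > best[i]:
                best[i] = cand
                back[i] = j

    # Follow the back-pointers to recover the blocks.
    clusters = []
    i = n
    while i > 0:
        j = back[i]
        clusters.append(ys[j:i])
        i = j
    clusters.reverse()
    return clusters, best[n]
```

Calling map_clustering([0.1, -0.2, 0.0, 10.1, 9.9, 10.0], tau2=100.0) returns the sorted blocks together with the unnormalized log posterior of the best partition found. The two nested loops over block boundaries are what give the quadratic cost; the prefix sums keep each block score constant-time.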

Similar resources

Modal Clustering in Univariate, Conjugate Dirichlet Process Mixture Models

The Dirichlet Process mixture (DPM) model is a popular nonparametric Bayesian tool for modeling unknown distributions through mixtures of components. Integrating out the latent location variables in a DPM model leads to a product partition model. This paper describes a mode-finding algorithm that quickly finds either the maximizer of the partition posterior or the maximizer of the partition lik...

A partition-based algorithm for clustering large-scale software systems

Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...

Modeling Stock Market Volatility Using Univariate GARCH Models: Evidence from Bangladesh

This paper investigates the nature of volatility characteristics of stock returns in the Bangladesh stock markets employing daily all share price index return data of Dhaka Stock Exchange (DSE) and Chittagong Stock Exchange (CSE) from 02 January 1993 to 27 January 2013 and 01 January 2004 to 20 August 2015 respectively.  Furthermore, the study explores the adequate volatility model for the stoc...

Improving Decision Trees by Clustering

Multi-modal classification problems arise in many fields and form an important class of problems. The presence of disjoint areas for each class creates special problems for techniques that cannot partition each class into more than one region. Among the various techniques that have been applied with some success to multi-modal problems are decision tree classifiers (DTCs) and back propagation n...

A Predictive View of Bayesian Clustering

This work considers probability models for partitions of a set of n elements using a predictive approach, i.e., models that are specified in terms of the conditional probability of either joining an already existing cluster or forming a new one. The inherent structure can be motivated by resorting to hierarchical models of either parametric or nonparametric nature. Parametric examples include t...

Publication date: 2003